37 research outputs found
Intent Models for Contextualising and Diversifying Query Suggestions
The query suggestion or auto-completion mechanisms help users to type less
while interacting with a search engine. A basic approach that ranks suggestions
according to their frequency in the query logs is suboptimal. Firstly, many
candidate queries with the same prefix can be removed as redundant. Secondly,
the suggestions can also be personalised based on the user's context. These two
directions to improve the aforementioned mechanisms' quality can be in
opposition: while the latter aims to promote suggestions that address search
intents that a user is likely to have, the former aims to diversify the
suggestions to cover as many intents as possible. We introduce a
contextualisation framework that utilises a short-term context using the user's
behaviour within the current search session, such as the previous query, the
documents examined, and the candidate query suggestions that the user has
discarded. This short-term context is used to contextualise and diversify the
ranking of query suggestions, by modelling the user's information need as a
mixture of intent-specific user models. The evaluation is performed offline on
a set of approximately 1.0M test user sessions. Our results suggest that the
proposed approach significantly improves query suggestions compared to the
baseline approach. Comment: A short version of this paper was presented at CIKM 201
Generalized Team Draft Interleaving
Interleaving is an online evaluation method that compares two ranking functions
by mixing their results and interpreting the users' click feedback. An important
property of an interleaving method is its sensitivity, i.e. the ability to
obtain reliable comparison outcomes with few user interactions. Several methods
have been proposed so far to improve interleaving sensitivity, which can be
roughly divided into two areas: (a) methods that optimize the credit assignment
function (how the click feedback is interpreted), and (b) methods that achieve
higher sensitivity by controlling the interleaving policy (how often a
particular interleaved result page is shown).
In this paper, we propose an interleaving framework that generalizes the
previously studied interleaving methods in two aspects. First, it achieves a
higher sensitivity by performing a joint data-driven optimization of the credit
assignment function and the interleaving policy. Second, we formulate the
framework to be general w.r.t. the search domain where the interleaving
experiment is deployed, so that it can be applied in domains with grid-based
presentation, such as image search. In order to simplify the optimization, we
additionally introduce a stratified estimate of the experiment outcome. This
stratification is also useful on its own, as it reduces the variance of the
outcome and thus increases the interleaving sensitivity.
We perform an extensive experimental study using large-scale document and image
search datasets obtained from a commercial search engine. The experiments show
that our proposed framework achieves marked improvements in sensitivity over
effective baselines on both datasets.
Emergent Language Generalization and Acquisition Speed are not tied to Compositionality
Studies of discrete languages emerging when neural agents communicate to
solve a joint task often look for evidence of compositional structure. This
stems from the expectation that such a structure would allow languages to be
acquired faster by the agents and enable them to generalize better. We argue
that these beneficial properties are only loosely connected to
compositionality. In two experiments, we demonstrate that, depending on the
task, non-compositional languages might show equal, or better, generalization
performance and acquisition speed than compositional ones. Further research in
the area should be clearer about what benefits are expected from
compositionality, and how the latter would lead to them.
Long-term Effects of Temperature Variations on Economic Growth: A Machine Learning Approach
This study investigates the long-term effects of temperature variations on
economic growth using a data-driven approach. Leveraging machine learning
techniques, we analyze global land surface temperature data from Berkeley Earth
and economic indicators, including GDP and population data, from the World
Bank. Our analysis reveals a significant relationship between average
temperature and GDP growth, suggesting that climate variations can
substantially impact economic performance. This research underscores the
importance of incorporating climate factors into economic planning and
policymaking, and it demonstrates the utility of machine learning in uncovering
complex relationships in climate-economy studies.
Optimised Scheduling of Online Experiments
Modern search engines increasingly rely on online evaluation methods such as A/B tests and interleaving. These online evaluation methods make use of interactions by the search engine's users to test various changes in the search engine. However, since the number of user sessions per unit of time is limited, the number of simultaneously running online evaluation experiments is bounded. In an extreme case, it might be impossible to deploy all experiments since they arrive faster than they are processed. Consequently, it is very important to use the limited resource of user interactions efficiently. In this paper, we formulate the novel problem of schedule optimisation for the queue of online experiments: given a limited number of user interactions available for experimentation, we want to re-order the queue so that the number of successful experiments is maximised. In order to build a schedule optimisation algorithm, we start by formulating a model of an online experimentation pipeline. Next, we propose to reduce the task of finding the optimal schedule to a learning-to-rank problem, where we require the most promising experiments to be ranked first in the schedule. To evaluate the proposed approach, we perform an evaluation study using two datasets containing 82 interleaving and 35 A/B test experiments, performed by a commercial search engine. We measure the quality of a schedule as the number of successful experiments executed under limited user interactions. Our proposed schedulers obtain improvements of up to 342% compared to the unoptimised baseline schedule on the dataset of interleaving experiments and up to 43% on the dataset of A/B tests.
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain
Contrastive Predictive Coding (CPC), based on predicting future segments of
speech from past segments, is emerging as a powerful algorithm for
representation learning of speech signals. However, it still under-performs
other methods on unsupervised evaluation benchmarks. Here, we introduce
WavAugment, a time-domain data augmentation library and find that applying
augmentation in the past is generally more efficient and yields better
performances than other methods. We find that a combination of pitch
modification, additive noise and reverberation substantially increases the
performance of CPC (relative improvement of 18-22%), beating the reference
Libri-light results with 600 times less data. Using an out-of-domain dataset,
time-domain data augmentation can push CPC to be on par with the state of the
art on the Zero Speech Benchmark 2017. We also show that time-domain data
augmentation consistently improves downstream limited-supervision phoneme
classification tasks by a factor of 12-15% relative.
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
We propose using self-supervised discrete representations for the task of
speech resynthesis. To generate disentangled representation, we separately
extract low-bitrate representations for speech content, prosodic information,
and speaker identity. This allows speech to be synthesized in a controllable
manner. We analyze various state-of-the-art, self-supervised representation
learning methods and shed light on the advantages of each method while
considering reconstruction quality and disentanglement properties.
Specifically, we evaluate the F0 reconstruction, speaker identification
performance (for both resynthesis and voice conversion), recordings'
intelligibility, and overall quality using subjective human evaluation. Lastly,
we demonstrate how these representations can be used for an ultra-lightweight
speech codec. Using the obtained representations, we can get to a rate of 365
bits per second while providing better speech quality than the baseline
methods. Audio samples can be found under the following link:
speechbot.github.io/resynthesis. Comment: In Proceedings of Interspeech 202